Covid19 Japanが独自に収集している陽性者単位のデータ(個票データ)。ソースとデータは全てGitHubにて公開されており、データはJSON形式。「レコード数 \(\neq\) 累計陽性者数」であることに注意。
Covid19 JapanがGitHubで公開しているデータは前述のようにJSON形式であり、最新データはlatest.jsonファイルで示されている。このため、読み込む際はひと工夫必要。
陽性者単位の個票データ。
# path <- "https://raw.githubusercontent.com/reustle/covid19japan-data/master/docs/patient_data/"
#
# df <- path %>%
# paste0("latest.json") %>%
# readr::read_lines() %>%
# paste0(path, .) %>%
# jsonlite::fromJSON()
df <- "https://raw.githubusercontent.com/reustle/covid19japan-data/master/docs/patient_data/latest.json" %>%
jsonlite::fromJSON()
df
死亡者数や重症者数などの推移データはsummaryフォルダ内のJSON形式ファイルにまとめられている。読み込むと分かるがリスト型で、その中データフレームが含まれる形式である。
summaryフォルダの他にsummary_minフォルダというフォルダがあるが、summary_minフォルダ内のJSONファイルは単に改行を省略して小さくしたファイル。
# path <- "https://raw.githubusercontent.com/reustle/covid19japan-data/master/docs/summary/"
#
# df_s <- path %>%
# paste0("latest.json") %>%
# readr::read_lines() %>%
# paste0(path, .) %>%
# jsonlite::fromJSON()
df_s <- "https://raw.githubusercontent.com/reustle/covid19japan-data/master/docs/summary/latest.json" %>%
jsonlite::fromJSON()
df_s %>% summary()
## Length Class Mode
## prefectures 27 data.frame list
## regions 12 data.frame list
## daily 37 data.frame list
## updated 1 -none- character
三つのデータフレームと一つのベクトル(更新日時)から構成されている。データフレームは上から順に都道府県別、地方別、日次となっているが、Lengthを見てわかるようにそれぞれに含まれる集計データが異なっている。
更新日時($updated)における都道府県単位での累積値。厚生労働省がオープンデータから除いている空港検疫・ダイヤモンドプリンセス・長崎クルーズ船・その他が含まれるので全51区分になっている。
df_s$prefectures
陽性者・死亡者などの時系列集計データがネストされて格納されている。日付はネストされていないので、各項目に対するstartDateの項を参照すること。
| 項目 | 内容 | 備考 |
|---|---|---|
| dailyConfirmedCount | 陽性者数 | 単日 |
| dailyConfirmedStartDate | 陽性者数のカウント開始日 | 区分により開始日が異なる |
| dailyDeceasedCount | 死亡者数 | 単日 |
| dailyDeceasedStartDate | 死亡者数のカウント開始日 | 区分により開始日が異なる |
| dailyRecoveredCumulative | 快復者数 | 累計 |
| dailyRecoveredStartDate | 快復者数のカウント開始日 | 区分により開始日が異なる |
| dailyActive | 治療者数1 | 単日 |
| dailyActiveStartDate | 治療者数のカウント開始日 | 区分により開始日が異なる |
1 陽性者数から死亡者数と快復者数を引いた数値を治療者数としている
更新日次時点における地方区分単位での累積値。陽性者の時系列集計データが都道府県単位データと同様にネストで格納されているが、死亡者・快復者・治療者のデータは含まれていない。
なお、時系列データの合計値と累積項の値が一致しない場合がある。
df_s$regions
df_s$regions$confirmed[1]
## [1] 93072
df_s$regions$dailyConfirmedCount[[1]] %>% sum()
## [1] 102013
個票データを日次で集計したもの。日付を見れば分かる通り暗黙の欠落を含んでいる。
df_s$daily
集計データの更新日時。
df_s$updated
## [1] "2020-12-19T22:51:45+09:00"
新型コロナウイルス対策病床オープンデータのデータも用意しておく。
if (googlesheets4::gs4_has_token()) {
beds_by_pref <- "https://docs.google.com/spreadsheets/d/1u0Ul8TgJDqoZMnqFrILyXzTHvuHMht1El7wDZeVrpp8" %>%
googlesheets4::read_sheet() %>%
dplyr::arrange(dplyr::desc(`発表日`)) %>%
dplyr::distinct(`自治体名`, .keep_all = TRUE) %>%
dplyr::rename(pref = `自治体名`, beds = `新型コロナウイルス対策感染症病床数`,
date = `発表日`) %>%
dplyr::mutate(beds = as.integer(beds), date = lubridate::as_date(date))
beds_by_pref
}
新型コロナ関連のニュース
news <- "https://gist.githubusercontent.com/k-metrics/76fea197fa32466a2f99ff59f721b98a/raw/af750e1a8a6c9331229c6139e1aec72f56833bbf/covid19_news.csv" %>%
readr::read_csv() %>%
dplyr::filter(area == "日本")
news
各変量(フィーチャー)を適切な形式に変換し、地域区分でも分析できるように都道府県データと結合することで、ベースとなるデータセットを作成する。なお、都道府県以外で報告されたレコードを除いている。
x <- df %>%
dplyr::select(patientId, date = dateAnnounced, gender,
pref = detectedPrefecture, patientStatus, knownCluster,
confirmedPatient, charterFlightPassenger,
cruisePassengerDisembarked, ageBracket,
deceasedDate, deceasedReportedDate) %>%
dplyr::filter(confirmedPatient == TRUE) %>%
dplyr::mutate(date = lubridate::as_date(date),
gender = forcats::as_factor(gender),
patientStatus = forcats::as_factor(patientStatus),
cluster = dplyr::if_else(!is.na(knownCluster), TRUE, FALSE),
ageBracket = forcats::as_factor(ageBracket),
deceasedDate = lubridate::as_date(deceasedDate),
deceasedReportedDate = lubridate::as_date(deceasedReportedDate)) %>%
dplyr::left_join(prefs, by = c("pref" = "pref")) %>%
dplyr::select(-`推計人口`, -pref) %>%
dplyr::rename(pref = `都道府県`, region = `八地方区分`) %>%
tidyr::drop_na(pref)
x
最初に陽性者をキーに集計する。
全国の累計陽性者数と推計人口[千人]、ならびに、人口千人あたりの累計陽性者数。
r_by_all <- x %>%
dplyr::filter(!is.na(pref)) %>%
dplyr::summarise(n = n()) %>%
dplyr::bind_cols(prefs %>% dplyr::summarise(population = sum(`推計人口`))) %>%
dplyr::mutate(rate = round(n / population, 2))
r_by_all %>%
dplyr::rename(`累計陽性者数[人]` = n, `推計人口[千人]` = population,
`人口千人あたりの累計陽性者数` = rate)
次に地方別の累計陽性者数と推計人口[千人]、ならびに、人口千人あたりの累計陽性者数。
region <- prefs %>%
dplyr::group_by(`八地方区分`) %>%
dplyr::summarise(population = sum(`推計人口`)) %>%
dplyr::rename(region = `八地方区分`)
r_by_region <- x %>%
dplyr::group_by(region) %>%
dplyr::summarise(n = n()) %>%
tidyr::drop_na() %>%
dplyr::left_join(region, by = c("region" = "region")) %>%
dplyr::select(region, n, population) %>%
dplyr::mutate(rate = round(n / population, 2))
r_by_region %>%
dplyr::rename(`地方` = region,
`累計陽性者数[人]` = n, `推計人口[千人]` = population,
`人口千人あたりの累計陽性者数` = rate)
上表を可視化する。グレーの破線は切片ゼロで傾きが全国の人口千人あたりの累計陽性者数(1.54)。
r_by_region %>%
dplyr::rename(key = region) %>%
ggplot2::ggplot(ggplot2::aes(x = population, y = n) ) +
ggplot2::geom_abline(slope = r_by_all$rate, intercept = 0,
colour = "gray", linetype = "dashed") +
ggplot2::geom_point(ggplot2::aes(colour = key)) +
ggrepel::geom_text_repel(ggplot2::aes(label = key, colour = key)) +
ggplot2::theme(legend.position = 'none') +
ggplot2::labs(title = paste0("推計人口と累計陽性者数 @", datetime),
subtitle = subtitle, caption = caption,
x = "推計人口[千人]", y = "累計陽性者数[人]")
同様に都道府県別の累計陽性者数と推計人口[千人]、ならびに、人口千人あたりの累計陽性者数。任意の列でソートできるようにしてある。
r_by_pref <- x %>%
dplyr::group_by(pref) %>%
dplyr::summarise(n = n()) %>%
tidyr::drop_na() %>%
dplyr::left_join(prefs, by = c("pref" = "都道府県")) %>%
dplyr::select(pref, n, population = `推計人口`) %>%
dplyr::mutate(rate = round(n / population, 2))
r_by_pref %>%
dplyr::rename(`都道府県` = pref,
`累計陽性者数[人]` = n, `推計人口[千人]` = population,
`人口千人あたりの累計陽性者数` = rate) %>%
tibble::rowid_to_column("No") %>%
DT::datatable()
上表を可視化する。グレーの破線は切片ゼロで傾きが全国の人口千人あたりの累計陽性者数(1.54)。
r_by_pref %>%
dplyr::rename(key = pref) %>%
ggplot2::ggplot(ggplot2::aes(x = population, y = n) ) +
ggplot2::geom_abline(slope = r_by_all$rate, intercept = 0,
colour = "gray", linetype = "dashed") +
ggplot2::geom_point(ggplot2::aes(colour = key)) +
ggrepel::geom_text_repel(ggplot2::aes(label = key, colour = key)) +
ggplot2::theme(legend.position = 'none') +
ggplot2::labs(title = paste0("推計人口と累計陽性者数 @", datetime),
subtitle = subtitle, caption = caption,
x = "推計人口[千人]", y = "累計陽性者数[人]")
推計人口が550万人未満の都道府県のみ抽出する。グレーの破線は上図と同様。
r_by_pref %>%
dplyr::filter(population < 5500) %>%
dplyr::rename(key = pref) %>%
ggplot2::ggplot(ggplot2::aes(x = population, y = n) ) +
ggplot2::geom_abline(slope = r_by_all$rate, intercept = 0,
colour = "gray", linetype = "dashed") +
ggplot2::geom_point(ggplot2::aes(colour = key)) +
ggrepel::geom_text_repel(ggplot2::aes(label = key, colour = key)) +
ggplot2::theme(legend.position = 'none') +
ggplot2::labs(title = paste0("推計人口と累計陽性者数 @", datetime),
subtitle = subtitle, caption = caption,
x = "推計人口[千人]", y = "累計陽性者数[人]")
全国の日次単位の陽性者数、前日差、累計、移動平均を求める。
x_by_all <- x %>%
dplyr::group_by(date) %>%
dplyr::summarise(n = n()) %>%
tidyr::complete(date = seq.Date(from = min(date), to = max(date), by = "day"),
fill = list(n = 0L)) %>%
dplyr::mutate(diff = lagdiff(n), cum = cumsum(n), ma7 = ma7(n), ma28 = ma28(n))
x_by_all %>%
dplyr::select(`発表日` = date, `陽性者数` = n, `前日差` = diff,
`累計陽性者数` = cum, `移動平均(7日)` = ma7)
上表を可視化する。
# 祝日ファイルは以下をダウンロードしておく(SSLがエラーになるので自動処理できない)
# "https://www8.cao.go.jp/chosei/shukujitsu/syukujitsu.csv"
# 祝日判定関数
source("https://raw.githubusercontent.com/logics-of-blue/website/master/010_forecast/20190714_R%E8%A8%80%E8%AA%9E%E3%81%AB%E3%81%8A%E3%81%91%E3%82%8B%E6%97%A5%E6%9C%AC%E3%81%AE%E7%A5%9D%E6%97%A5%E5%88%A4%E5%AE%9A/jholiday.R", encoding="utf-8")
sec_scale <- 100
emergency <- news %>%
dplyr::filter(category == "緊急事態")
goto <- news %>%
dplyr::filter(category == "GoTo")
x_by_all %>%
dplyr::mutate(hday = is.jholiday(target_date = date,
holiday_source = "./Covid19/syukujitsu.csv"),
wday = lubridate::wday(date, week_start = 1),
wday = dplyr::if_else(wday > 5, TRUE, FALSE),
wday = dplyr::if_else(hday | wday, 3000L, NA_integer_)) %>%
ggplot2::ggplot(ggplot2::aes(x = date)) +
# ggplot2::geom_bar(ggplot2::aes(y = wday), stat = "identity", width = 1.0,
# fill = "red", alpha = 0.1) +
# ggplot2::geom_vline(ggplot2::aes(xintercept = date), data = emergency,
# colour = "dark blue", linetype = "dashed", size = 0.15) +
ggplot2::geom_vline(ggplot2::aes(xintercept = date), data = goto,
colour = "magenta", linetype = "dashed", size = 0.25) +
ggplot2::geom_bar(ggplot2::aes(y = n), stat = "identity", width = 1.0,
fill = "dark gray", alpha = 1.0) +
ggplot2::geom_line(ggplot2::aes(y = ma7), linetype = "dashed",
colour = "dark green", size = 0.5) +
ggplot2::geom_line(ggplot2::aes(y = cum / sec_scale),
colour = "dark green", size = 1.0) +
ggplot2::labs(title = paste0("【全国】陽性者数の推移(単日) @", datetime),
subtitle = subtitle, caption = caption,
x = "", y = "") +
# ggrepel::geom_label_repel(ggplot2::aes(x = date, y = 2500, label = news),
# size = 2.0, data = emergency) +
ggrepel::geom_label_repel(ggplot2::aes(x = date, y = 3000, label = news),
size = 2.5, data = goto, colour = "magenta") +
ggplot2::scale_y_continuous(
name = "陽性者数・移動平均(破線)",
sec.axis = ggplot2::sec_axis(~ . * sec_scale,
name = "累積陽性者数(折線)")
)
x_by_all %>%
ggplot2::ggplot(ggplot2::aes(x = date)) +
ggplot2::geom_line(ggplot2::aes(y = diff), colour = "dark green", alpha = 0.5) +
ggplot2::labs(title = paste0("【全国】陽性者数の前日差 @", datetime),
subtitle = subtitle, caption = caption,
x = "", y = "前日差")
同様に地方別の日次単位の陽性者数、前日差、累計、移動平均を求める。
x_by_region <- x %>%
dplyr::group_by(date, region) %>%
dplyr::summarise(n = n()) %>%
dplyr::ungroup() %>%
tidyr::pivot_wider(names_from = region, values_from = n, values_fill = 0L) %>%
tidyr::complete(date = seq.Date(from = min(date), to = max(date), by = "day")) %>%
tidyr::pivot_longer(cols = -date, names_to = "region", values_to = "n") %>%
tidyr::replace_na(replace = list(n = 0L)) %>%
dplyr::group_by(region) %>%
tidyr::nest() %>%
dplyr::mutate(diff = purrr::map(data, ~ lagdiff(.$n)),
cum = purrr::map(data, ~ cumsum(.$n)),
ma7 = purrr::map(data, ~ ma7(.$n)),
ma28 = purrr::map(data, ~ ma28(.$n))) %>%
tidyr::unnest() %>%
dplyr::left_join(prefs %>% dplyr::distinct(`八地方区分`), .,
by = c("八地方区分" = "region")) %>%
dplyr::mutate(region = forcats::fct_inorder(`八地方区分`)) %>%
dplyr::arrange(date)
x_by_region %>%
dplyr::filter(date == max(date)) %>%
dplyr::mutate(ma7 = round(ma7, 1)) %>%
dplyr::select(`地方` = region,
`発表日` = date, `陽性者数` = n, `前日差` = diff,
`陽性者累計` = cum, `移動平均(7日)` = ma7)
x_by_region %>%
dplyr::select(`地方` = region,
`発表日` = date, `陽性者数` = n, `前日差` = diff,
`陽性者累計` = cum, `移動平均(7日)` = ma7)
上表を可視化する。
x_by_region %>%
ggplot2::ggplot(ggplot2::aes(x = date, y = n)) +
ggplot2::geom_bar(ggplot2::aes(y = n, fill = region), stat = "identity",
width = 1.0, alpha = 0.5) +
ggplot2::labs(title = paste0("【地方別】陽性者数の推移(単日) @", datetime),
subtitle = subtitle, caption = caption,
x = "", y = "陽性者数")
x_by_region %>%
ggplot2::ggplot(ggplot2::aes(x = date, y = ma7, colour = region)) +
ggplot2::geom_vline(ggplot2::aes(xintercept = date), data = goto,
colour = "magenta", linetype = "dashed", size = 0.25) +
ggrepel::geom_label_repel(ggplot2::aes(x = date, y = 1000, label = news),
size = 2.5, data = goto, colour = "magenta") +
ggplot2::geom_line(size = 1) +
ggplot2::theme(legend.position = 'none') +
ggplot2::labs(title = paste0("【地方別】移動平均(7日) @", datetime),
subtitle = subtitle, caption = caption,
x = "", y = "陽性者数") +
ggrepel::geom_text_repel(ggplot2::aes(label = region),
data = subset(x_by_region, date == max(date)),
nudge_x = 30, segment.alpha = 0.5, size = 4) +
ggplot2::lims(x = c(min(x_by_region$date),
max(x_by_region$date) + 45))
x_by_region %>%
ggplot2::ggplot(ggplot2::aes(x = date, y = cum, colour = region)) +
ggplot2::geom_line(size = 1) +
ggplot2::theme(legend.position = 'none') +
ggplot2::labs(title = paste0("【地方別】累計陽性者数 @", datetime),
subtitle = subtitle, caption = caption,
x = "", y = "累計陽性者数") +
ggrepel::geom_text_repel(ggplot2::aes(label = region),
data = subset(x_by_region, date == max(date)),
nudge_x = 30, segment.alpha = 0.5, size = 4) +
ggplot2::lims(x = c(min(x_by_region$date),
max(x_by_region$date) + 45))
地方単位で可視化。
sec_scale <- 20
ncol <- 2
x_by_region %>%
dplyr::rename(key = region) %>%
ggplot2::ggplot(ggplot2::aes(x = date)) +
ggplot2::geom_bar(ggplot2::aes(y = n, fill = key), stat = "identity",
alpha = 0.5, width = 1.0) +
ggplot2::geom_line(ggplot2::aes(y = ma7, colour = key),
linetype = "dotted", size = 0.5) +
ggplot2::geom_line(ggplot2::aes(y = cum / sec_scale, colour = key)) +
ggplot2::facet_wrap(~ key, ncol = ncol) +
ggplot2::theme(legend.position = 'none') +
ggplot2::labs(title = paste0("Fixed scale @", datetime),
subtitle = subtitle, caption = caption,
x = "", y = "") +
ggplot2::scale_y_continuous(
name = "陽性者数・移動平均(点線)",
sec.axis = ggplot2::sec_axis(~ . * sec_scale,
name = "陽性者累計(実線)")
)
傾向が見えるように縦軸をフリースケールとする。
sec_scale <- 20
ncol <- 2
x_by_region %>%
dplyr::rename(key = region) %>%
ggplot2::ggplot(ggplot2::aes(x = date)) +
ggplot2::geom_bar(ggplot2::aes(y = n, fill = key), stat = "identity",
alpha = 0.5, width = 1.0) +
ggplot2::geom_line(ggplot2::aes(y = ma7, colour = key),
linetype = "dashed", size = 0.5) +
ggplot2::geom_line(ggplot2::aes(y = cum / sec_scale, colour = key)) +
ggplot2::facet_wrap(~ key, ncol = ncol, scales = "free_y") +
ggplot2::theme(legend.position = 'none') +
ggplot2::labs(title = paste0("Free Y scale @", datetime),
subtitle = subtitle, caption = caption,
x = "", y = "") +
ggplot2::scale_y_continuous(
name = "陽性者数・移動平均(破線)",
sec.axis = ggplot2::sec_axis(~ . * sec_scale,
name = "陽性者累計(実線)")
)
同様に都道府県別の日次単位の陽性者数、前日差、累計、移動平均を求める。
x_by_pref <- x %>%
dplyr::group_by(date, pref) %>%
dplyr::summarise(n = n()) %>%
dplyr::ungroup() %>%
tidyr::pivot_wider(names_from = pref, values_from = n, values_fill = 0L) %>%
tidyr::complete(date = seq.Date(from = min(date), to = max(date), by = "day")) %>%
tidyr::pivot_longer(cols = -date, names_to = "pref", values_to = "n") %>%
tidyr::replace_na(replace = list(n = 0L)) %>%
dplyr::group_by(pref) %>%
tidyr::nest() %>%
dplyr::mutate(diff = purrr::map(data, ~ lagdiff(.$n)),
cum = purrr::map(data, ~ cumsum(.$n)),
ma7 = purrr::map(data, ~ ma7(.$n)),
ma28 = purrr::map(data, ~ ma28(.$n))) %>%
tidyr::unnest() %>%
dplyr::left_join(prefs, ., by = c("都道府県" = "pref")) %>%
dplyr::mutate(pref = forcats::fct_inorder(`都道府県`)) %>%
dplyr::arrange(date)
x_by_pref %>%
dplyr::filter(date == max(date)) %>%
dplyr::mutate(ma7 = round(ma7, 1)) %>%
dplyr::select(`都道府県` = pref,
`発表日` = date, `陽性者数` = n, `前日差` = diff,
`陽性者累計` = cum, `移動平均(7日)` = ma7) %>%
DT::datatable()
x_by_pref %>%
dplyr::select(`都道府県` = pref,
`発表日` = date, `陽性者数` = n, `前日差` = diff,
`陽性者累計` = cum, `移動平均(7日)` = ma7)
上表を可視化する。
sec_scale <- 100
ncol <- 5
datetime <- lubridate::as_datetime(df_s$updated, tz = "Japan")
x_by_pref %>%
dplyr::rename(key = pref) %>%
ggplot2::ggplot(ggplot2::aes(x = date)) +
ggplot2::geom_bar(ggplot2::aes(y = n, fill = key), stat = "identity",
alpha = 0.25, width = 1.0) +
ggplot2::geom_line(ggplot2::aes(y = ma7, colour = key),
linetype = "solid", size = 0.25) +
ggplot2::geom_line(ggplot2::aes(y = cum / sec_scale, colour = key)) +
ggplot2::facet_wrap(~ key, ncol = ncol) +
ggplot2::theme(legend.position = 'none') +
ggplot2::labs(title = paste0("Fixed scale @", datetime),
subtitle = subtitle, caption = caption,
x = "", y = "") +
ggplot2::scale_y_continuous(
name = "陽性者数・移動平均(細線)",
sec.axis = ggplot2::sec_axis(~ . * sec_scale,
name = "累計陽性者数(折線)")
)
傾向が見えるように縦軸をフリースケールとする。
x_by_pref %>%
dplyr::rename(key = pref) %>%
ggplot2::ggplot(ggplot2::aes(x = date)) +
ggplot2::geom_bar(ggplot2::aes(y = n, fill = key), stat = "identity",
alpha = 0.35, width = 1.0) +
ggplot2::geom_line(ggplot2::aes(y = ma7, colour = key),
linetype = "solid", size = 0.25) +
ggplot2::geom_line(ggplot2::aes(y = cum / sec_scale, colour = key)) +
ggplot2::facet_wrap(~ key, ncol = ncol, scales = "free_y") +
ggplot2::theme(legend.position = 'none') +
ggplot2::labs(title = paste0("Free Y scale @", datetime),
subtitle = subtitle, caption = caption,
x = "", y = "") +
ggplot2::scale_y_continuous(
name = "陽性者数・移動平均(細線)",
sec.axis = ggplot2::sec_axis(~ . * sec_scale,
name = "累計陽性者数(折線)")
)
厚生労働省のデータと乖離がある。
都道府県別の日次単位の死亡者数、前日差、累計、移動平均(7日)を求める。
start <- df_s$prefectures %>%
dplyr::select(pref = name, date = dailyDeceasedStartDate) %>%
dplyr::left_join(prefs, by = c("pref" = "pref")) %>%
dplyr::arrange(pcode) %>%
tidyr::drop_na(pcode) %>%
dplyr::select(date, pref = `都道府県`) %>%
dplyr::distinct(date) %>%
.$date %>% lubridate::as_date()
d_by_prefs <- df_s$prefectures %>%
dplyr::select(deceased = dailyDeceasedCount, pref = name) %>%
dplyr::left_join(prefs, by = c("pref" = "pref")) %>%
tidyr::drop_na(pcode) %>%
dplyr::select(pref = `都道府県`, deceased) %>%
tidyr::unnest(deceased) %>%
tidyr::pivot_wider(names_from = pref, values_from = deceased) %>%
tidyr::unnest() %>%
dplyr::mutate(date = seq.Date(from = start, to = start + nrow(.) - 1,
by = "day")) %>%
dplyr::select(date, dplyr::everything()) %>%
tidyr::pivot_longer(col = -date, names_to = "pref", values_to = "n") %>%
dplyr::group_by(pref) %>%
tidyr::nest() %>%
dplyr::mutate(diff = purrr::map(data, ~ lagdiff(.$n)),
cum = purrr::map(data, ~ cumsum(.$n)),
ma7 = purrr::map(data, ~ ma7(.$n))) %>%
tidyr::unnest() %>%
dplyr::left_join(prefs, ., by = c("都道府県" = "pref")) %>%
dplyr::mutate(pref = forcats::fct_inorder(`都道府県`)) %>%
dplyr::select(date, pref, n, diff, cum, ma7) %>%
dplyr::arrange(date)
d_by_prefs
sec_scale <- 100
ncol <- 5
datetime <- lubridate::as_datetime(df_s$updated, tz = "Japan")
d_by_prefs %>%
dplyr::rename(key = pref) %>%
ggplot2::ggplot(ggplot2::aes(x = date)) +
ggplot2::geom_bar(ggplot2::aes(y = n, fill = key), stat = "identity",
alpha = 0.25, width = 1.0) +
ggplot2::geom_line(ggplot2::aes(y = ma7, colour = key),
linetype = "solid", size = 0.25) +
ggplot2::geom_line(ggplot2::aes(y = cum / sec_scale, colour = key)) +
ggplot2::facet_wrap(~ key, ncol = ncol, scales = "free_y") +
ggplot2::theme(legend.position = 'none') +
ggplot2::labs(title = paste0("Free Y scale @", datetime),
subtitle = subtitle, caption = caption,
x = "", y = "") +
ggplot2::scale_y_continuous(
name = "陽性者数・移動平均(細線)",
sec.axis = ggplot2::sec_axis(~ . * sec_scale,
name = "累計陽性者数(折線)")
)
集計データ$regionsには死亡者数の日次データが存在しないため$prefecturesのデータから計算する。
d_by_region <- d_by_prefs %>%
dplyr::select(date, pref = pref, n) %>%
dplyr::left_join(prefs, by = c("pref" = "都道府県")) %>%
tidyr::drop_na(pcode) %>%
dplyr::group_by(date, `八地方区分`) %>%
dplyr::summarise(n = sum(n)) %>%
dplyr::ungroup() %>%
dplyr::rename(region = `八地方区分`) %>%
dplyr::group_by(region) %>%
tidyr::nest() %>%
dplyr::mutate(diff = purrr::map(data, ~ lagdiff(.$n)),
cum = purrr::map(data, ~ cumsum(.$n)),
ma7 = purrr::map(data, ~ ma7(.$n))) %>%
tidyr::unnest() %>%
dplyr::arrange(date)
d_by_region
rpd_by_all <- d_by_region %>%
dplyr::group_by(region) %>%
dplyr::summarise(d = sum(n)) %>%
dplyr::left_join(r_by_region, ., by = c("region")) %>%
dplyr::select(region, positive = n, deceased = d, population) %>%
dplyr::select(-region) %>%
dplyr::summarise_all(sum) %>%
dplyr::mutate(p_rate = round(positive / population, 2),
d_rate = round(deceased / positive, 2))
rpd_by_all %>%
dplyr::rename(`陽性者数` = positive, `死亡者数` = deceased,
`推計人口` = population, `人口千人あたりの陽性者比率` = p_rate,
`陽性者に対する死亡者比率` = d_rate)
rpd_by_region <- d_by_region %>%
dplyr::group_by(region) %>%
dplyr::summarise(d = sum(n)) %>%
dplyr::left_join(r_by_region, ., by = c("region")) %>%
dplyr::select(region, positive = n, deceased = d, population, p_rate = rate) %>%
dplyr::mutate(d_rate = round(deceased / positive, 2))
rpd_by_region %>%
dplyr::rename(`陽性者数` = positive, `死亡者数` = deceased,
`推計人口` = population, `人口千人あたりの陽性者比率` = p_rate,
`陽性者に対する死亡者比率` = d_rate)
rpd_by_prefs <- d_by_prefs %>%
dplyr::group_by(pref) %>%
dplyr::summarise(d = sum(n)) %>%
dplyr::left_join(r_by_pref, ., by = "pref") %>%
dplyr::select(pref, positive = n, deceased = d, population, p_rate = rate) %>%
dplyr::mutate(d_rate = round(deceased / positive, 2))
rpd_by_prefs %>%
dplyr::rename(`陽性者数` = positive, `死亡者数` = deceased,
`推計人口` = population, `人口千人あたりの陽性者比率` = p_rate,
`陽性者に対する死亡者比率` = d_rate)
都道府県別のデータから全国の日次集計を求める。
d_by_all <- d_by_prefs %>%
dplyr::group_by(date) %>%
dplyr::summarise(n = sum(n)) %>%
dplyr::ungroup() %>%
dplyr::mutate(diff = lagdiff(n), cum = cumsum(n), ma7 = ma7(n))
d_by_all
sec_scale <- 50
datetime <- lubridate::as_datetime(df_s$updated, tz = "Japan")
d_by_all %>%
ggplot2::ggplot(ggplot2::aes(x = date)) +
ggplot2::geom_bar(ggplot2::aes(y = n), fill = "dark gray", stat = "identity",
alpha = 0.25, width = 1.0) +
ggplot2::geom_line(ggplot2::aes(y = ma7), colour = "dark green",
linetype = "dashed", size = 0.25) +
ggplot2::geom_line(ggplot2::aes(y = cum / sec_scale), colour = "dark green") +
ggplot2::theme(legend.position = 'none') +
ggplot2::labs(title = paste0("Fixed scale @", datetime),
subtitle = subtitle, caption = caption,
x = "", y = "") +
ggplot2::scale_y_continuous(
name = "死亡者数・同移動平均(破線)",
sec.axis = ggplot2::sec_axis(~ . * sec_scale,
name = "累計死亡者数(実線)")
)